Implementing the PPM data compression scheme
Abstract
The “Prediction by Partial Matching” (PPM) data compression algorithm developed by Cleary and Witten is capable of very high compression rates, encoding English text in as little as 2.2 bits/character. Here it is shown that the estimates made by Cleary and Witten of the resources required to implement the scheme can be revised to allow for a tractable and useful implementation. In particular, a variant is described that encodes and decodes at over 4 kbytes/s on a small workstation, and operates within a few hundred kilobytes of data space, but still obtains compression of about 2.4 bits/character on
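To make the idea of prediction by partial matching concrete, the following is a minimal sketch in Python. It is not the implementation described in the paper, and it is not Cleary and Witten's exact method. It assumes an escape estimate based on the number of distinct symbols seen in a context (often called method C), applies exclusion, and updates counts at every order. The names `PPMSketch` and `code_length_bits` are hypothetical. No arithmetic coder is included; the sketch only sums the ideal code length, -log2(p), over the input.

```python
import math
from collections import defaultdict


class PPMSketch:
    """Illustrative PPM-style byte model (assumed design, see text above)."""

    def __init__(self, max_order=2, alphabet_size=256):
        self.max_order = max_order
        self.alphabet_size = alphabet_size
        # counts[context][symbol] = times symbol followed context
        self.counts = defaultdict(lambda: defaultdict(int))

    def prob(self, history, symbol):
        """Probability of `symbol` (an int) after `history` (bytes)."""
        excluded = set()  # symbols already ruled out at higher orders
        p = 1.0           # product of escape probabilities so far
        for order in range(min(self.max_order, len(history)), -1, -1):
            ctx = history[len(history) - order:]
            seen = self.counts.get(ctx)
            if not seen:
                continue  # context never observed: fall through at no cost
            live = {s: c for s, c in seen.items() if s not in excluded}
            if not live:
                continue
            # Assumed estimate: escape gets one count per distinct symbol.
            total = sum(live.values()) + len(live)
            if symbol in live:
                return p * live[symbol] / total
            p *= len(live) / total  # escape to the next shorter context
            excluded.update(live)
        # Order -1: uniform over the symbols not yet excluded.
        return p / (self.alphabet_size - len(excluded))

    def update(self, history, symbol):
        """Count `symbol` in every context order (a simplification)."""
        for order in range(min(self.max_order, len(history)) + 1):
            ctx = history[len(history) - order:]
            self.counts[ctx][symbol] += 1


def code_length_bits(data, max_order=2):
    """Ideal adaptive code length of `data` (bytes) under the sketch model."""
    model = PPMSketch(max_order)
    bits = 0.0
    for i, sym in enumerate(data):
        history = data[max(0, i - max_order):i]
        bits += -math.log2(model.prob(history, sym))
        model.update(history, sym)
    return bits
```

Dividing `code_length_bits(text)` by `len(text)` gives an estimate in bits/character, the unit used in the abstract. A small sketch like this will not reproduce the figures quoted there, which depend on the model order, the escape method, and the test data.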
Related resources
Extended Application of Suffix Trees to Data Compression
A practical scheme for maintaining an index for a sliding window in optimal time and space, by use of a suffix tree, is presented. The index supports location of the longest matching substring in time proportional to the length of the match. The total time for build and update operations is proportional to the size of the input. The algorithm, which is simple and straightforward, is presented i...
Unbounded Length Contexts for PPM
The PPM data compression scheme has set the performance standard in lossless compression of text throughout the past decade. PPM is a finite-context statistical modelling technique that can be viewed as blending together several fixed-order context models to predict the next character in the input sequence. This paper gives a brief introduction to PPM, and describes a variant of the algorithm, ca...
A Single Core Hardware Module of a Data Compression Scheme Using Prediction by Partial Matching Technique
Problem statement: Compression is useful because it helps reduce the consumption of expensive resources, such as hard disk space or transmission bandwidth. For effective data compression, the compression algorithm must be able to predict future data accurately in order to build a good probabilistic model for compression. Lossless compression is essential in cases where it is important that the ...
Generic Adaptive Syntax-Directed Compression for Mobile Code
We propose a new scheme for compressing mobile programs. Our proposal is meant as part of a larger infrastructure for code distribution and deployment. In this paper we show how to effectively compress programs on the source level by compressing abstract syntax trees (ASTs) which are equivalent to source code (modulo comments and layout). We compress ASTs by adapting the well-known PPM (predicti...
Text Compression using Recency Rank with Context and Relation to Context Sorting, Block Sorting and PPM*
Recently, the block sorting compression scheme was developed and its relation to statistical schemes was studied, but its performance has not been well analysed theoretically. Context sorting is a compression scheme based on context similarity; it is regarded as an online version of block sorting and is asymptotically optimal. However, the compression speed is slower and the real performan...
Journal title:
- IEEE Trans. Communications
Volume 38
Publication date: 1990